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Abstract 

The algorithms of Pan (1995) [201 and Pan (2002) [22]) approximate the roots of a complex 
univariate polynomial in nearly optimal arithmetic and Boolean time but require a precision of 
computing that exceeds the degree of the polynomial. This causes numerical stability problems 
when the degree is large. We observe, however, that such a difficulty disappears at the initial 
stage of the algorithms, and in our present paper we extend this stage to root-finding within 
a nearly optimal arithmetic and Boolean complexity bounds provided that some mild initial 
isolation of the roots of the input polynomial has been ensured. Furthermore our algorithm 
is nearly optimal for the approximation of the roots isolated in a fixed disc, square or another 
region on the complex plane rather than all complex roots of a polynomial. Moreover the 
algorithm can be applied to a polynomial given by a black box for its evaluation (even if its 
coefficients are not known); it promises to be of practical value for polynomial root-finding and 
factorization, the latter task being of interest on its own right. We also provide a new support 
for a winding number algorithm, which enables extension of our progress to obtaining mild 
initial approximations to the roots. We conclude with summarizing our algorithms and their 
extension to the approximation of isolated multiple roots and root clusters. 
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1 Introduction 

The classical problem of univariate polynomial root-finding has been central in Mathematics and 
Computational Mathematics for about four millennia since the Sumerian times, and is still important 
for Signal and Image Processing, Control, Geometric Modeling, Computer Algebra, and Financial 
Mathematics. It is closely linked to the approximation of linear and nonlinear factors of a polynomial, 
which is also important on its own right because of the applications to the time series analysis, Weiner 
filtering, noise variance estimation, covariance matrix computation, and the study of multi-channel 
systems (see Wilson (1969) [3l], Box and Jenkins (1976) [6j, Barnett (1983) [T], Demeure and Mullis 
(1989 and 1990) [8], [9], Van Dooren (1994)) [32]. 

Solution of both problems within nearly optimal arithmetic and Boolean complexity bounds 
(up to polylogarithmic factors) have been obtained in Pan (1995) [20] and Pan (2002) [22], but 
the supporting algorithms require a precision of computing that exceeds the degree of the input 
polynomial, and this causes numerical stability problems when the degree is large. 

The most popular packages of numerical subroutines for complex polynomial root-finding, such 
as MPSolve 2000 (see Bini and Fiorentino (2000) [3]), EigenSolve 2001 (see Fortune (2002) [TU]1. and 
MPSolve 2012 (see Bini and Robol (2014) j5]) employ alternative root-finders based on functional it¬ 
erations (namely, Ehrlich-Abertli’s and WDK, that is, Weierstrass’, also known as Durand-Kerner’s) 
and the QR algorithm applied to eigen-solving for the companion matrix of the input polynomial. 
The user considers these root-finders practically superior by relying on the empirical data about 
their excellent convergence, even though these data have no formal support. To their disadvantage, 
these algorithms compute the roots of a polynomial in an isolated region of the complex plane not 
much faster than all its roots. 

We re-examine the subject, still assuming input polynomials with complex coefficients, and show 
that the cited deficiency of the algorithms of [20] and [22] disappears if we modify the initial stage 
of these algorithms and apply them under some mild assumptions about the initial isolation of the 
root sets of the input polynomial. Moreover, like the algorithms of [20] and [22] and unlike WDK 
and Ehrlich-Aberth’s algorithms, the resulting algorithms are nearly optimal for the approximation 
of the roots in an isolated region of the complex plane. 

Next we briefly comment on our results. In the next sections we elaborate upon them, deduce 
the computational cost estimates, and outline some natural extensions. 

Recall that polynomial root-finding iterations can be partitioned into two stages. At first a crude 
(although reasonably good) initial approximations to all roots or to a set of roots are relatively slowly 
computed. Then these approximations are refined faster by means of the same or distinct iterations. 

Our first algorithm applies at the second stage and is nearly optimal, under both arithmetic and 
Boolean complexity models and under mild initial isolation of every root, some roots or some root 
sets. Such an isolation can be observed at some stages of root approximation by Ehrlich-Aberth’s and 
WDK algorithms, but with no estimates for the computational cost of reaching isolation. Towards 
the solution with controlled computational cost, one can apply advanced variants of Quad-tree 
construction of Weyl 1924 [33], successively refined in Henrici 1974 [12], Renegar 1987 [28], and Pan 

2000 ED- 

The algorithm of the latter paper computes all roots of a polynomial or its roots in a fixed 
isolated region at a nearly optimal arithmetic cost, and it is nearly optimal for computing the initial 
isolation of the roots as well; the paper [21] does not estimate the Boolean cost of its algorithm, but 
most of its steps allow rather straightforward control of the precision of computing!]] 

Moreover in Section [5] we present a new winding number algorthm that computes the number of 
roots in a fixed square on a complex plane at a nearly optimal Boolean cost. By incorporating this 
algorithm into the refined variant of Weyl’s Quad tree construction of [21] . we extend our progress 

1 More recent variations of the Quad-tree algorithm have been studied by various authors under the name of 
subdivision algorithms for polynomial root-finding. 
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to obtaining mild initial isolation of all the roots of a polynomial as well as its roots isolated in a 
fixed region. This root-finder is performed at a nearly optimal Boolean cost and can be applied even 
where a polynomial is given by a black box for its evaluation. 

If properly implemented our algorithms have very good chances to become the user’s choice. 

Besides the root-finding applications, they can be valuable ingredients of the polynomial factor¬ 
ization algorithms. Recall that one can extend factorization to root-finding (see Sclionhage (1982) 
[291 . [20] . [22l . and the present paper), but also root-finding to factorization (see Pan (2012) [2311. 
Our algorithm can be technically linked to those of (29), [20], [22] ; our results (Theorem [8] and 
nnD could be also viewed as an extension of the recent record and nearly optimal bounds for the 
approximation of the real roots [26; 127], see also [nl- 

An interesting challenge is the design of a polynomial factorization algorithms that both are 
simple enough for practical implementation and support factorization at a nearly optimal computa¬ 
tional complexity. An efficient solution outlined in our last section combines our present algorithms 
with the one of Pan 2012 [22] and McNamee and Pan 2013 [112 Section 15.23], which is a simplified 
version of the efficient but very much involved algorithm of Kirrinnis (1998) [13) . 

Like our present algorithm, these solution algorithms are nearly optimal and remain nearly 
optimal when they are applied to a polynomial given by a black box for its evaluation, even when 
its coefficients are not known. 

Organization of the paper. We recall the relevant definitions and some basic results in the 
remainder of this section and in the next section. In Section [2] we present our main algorithm, prove 
its correctness, and estimate its arithmetic cost when it is applied to the approximation of a single 
root and d simple isolated roots of a dth degree polynomial. In Section [4] we extend our analysis to 
estimate the Boolean cost of these computations. In Section [5] we present our new winding number 
algorthm. In our concluding Section [G] we summarize our results and outline their extension to 
factorization of a polynomial and to root-finding in the cases of isolated multiple roots and root 
clusters. 

Some definitions. 

• For a polynomial u = u(x ) = the norms IMI7 denote the norms ||u || 7 of its 

coefficient vector u = (Ui)i= 0 , for 7 = 1 , 2 , 00 . 

• _D(A, r) denotes the complex disc {2 : \x — X\ < r}. 

• “ops” stands for “arithmetic operations”. 

• DFT(q) denotes the discrete Fourier transform at q points. It can be performed by using 
0{qlog(q)) ops. 

2 Isolation Ratio and Root-refinement 

The following concept of the isolation ratio is basic for us, as well as for [201 and [22]. Assume a real 
or complex polynomial 


d d 

P = p(x) = 'Y^PiX 1 = p n JJ(z - Zj), Pd^ 0 , ( 1 ) 

i=0 j—1 

of degree d, an annulus A{X, R, r) = {x : r < \x — X\ < R} on the complex plane with a center X, 
and the radii r and R of the boundary circles. Then the internal disc D(X,r) is R/r-isolated, and 
we call R/r its isolation ratio if the polynomial p has no roots in the annulus. The isolation ratios, 
for all discs D( 0, r) and for all positive r, can be approximated as long as we can approximate the 
root radii \zj\, for j = 1 ,,d. 
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The algorithms of [22] (cf. also Pan (2000) [23] and [22]) yield such approximations within a 
constant relative error, say, 0.01, by using 0(dlog 2 (d)) ops, but involve the Dandelin’s root-squaring 
iteration (see Householder (1959) [TT]1. and this leads to numerical stability problems. 

Alternative heuristic algorithms of Bini (1996) (21 and [3] are slightly faster, but also cannot 
produce close approximation without using root-squaring iteration. 

The Schur-Cohn test does not use these iterations and can be applied to estimate the isolation 
ratio more directly. For the disc 11(0,r), a variant of this test in Renegar (1987) [28] Section 7] 
amounts to performing FFT at d! = 2 h points, for 16d < d! < 32d, with the overhead of 0{n) ops 
and comparisons of real numbers with 0. This means a reasonably low precision of computing and 
Boolean cost. Also see Brunie and Picart (2000) [7j. 

The following result from Tilli (1998) [30] shows that Newton’s classical iteration converges 
quadratically to a single simple root of p if it is initiated at the center of a 3d-isolated disc that 
contains just this root. The result softens the restriction that s > 5 d 2 of [23 Corollary 4.5]. 

Theorem 1 . Suppose that both discs D(c,r) and D(c,r/s), for s > 3d, contain a single simple root 
a of a polynomial p = p(x) of degree d. Then Newton’s iteration 


x k +i =x k - p(x k )/p'(x k ), k = 0 , 1 ,... ( 2 ) 

converges quadratically to the root a right from the start provided that Xg = c. 


3 Increasing Crude Isolation Ratios of Polynomial Roots 

We can shift and scale the variable x, and so with no loss of generality we assume dealing with a 
(1 + ^-isolated disc 11(0, r), for r = 1/(1 + rj), a fixed rj > 0, and a polynomial p of |l]) having 
precisely k not necessarily distinct roots zi,...,Zk in this disc. 

In Section [ 6 ] we outline some important extensions of our current study based on our results in 
the general case of any k < d, which we produce next, but in Sections [3] and [4] we assume that k = 1. 

Now, under the above assumptions for any k < d, can we increase the isolation ratio, say, to 3d? 
We apply the following algorithm. 

Algorithm 2. The Power Sums. 

Input: three integers d, k and q, such that 0 < k < n, k < q, and 77 > 0, 
w = oj q = exp(27T\/— 1/q), a primitive gth root of unity, 
the coefficients pg,.. ■ ,Pd of a polynomial p = p{x) of H]). 

We write r = 1/(1 + 77 ) and assume that the disc D(0,r) 

(i) is (1 + ? 7 ) 2 -isolated and 

(ii) contains the k roots Z\,... ,Zk of the polynomial p = p(x) and no other roots. 

Output: the values such that 

Wg ~ <t*\ < A M = (r q+k + (d — l)r q ~ k )/(1 - r q ), for g = 1,.. ,,k, (3) 

k 

<Tg = Y^ z h for g = 1,..., k. (4) 

j-l 


Computations: 

1. Compute the coefficients of the two auxiliary polynomials p q (x) = Yl\=gPq,i xl an( f 

P q {x ) = Yli=oP_qA xl where p q<i = Y!j=oPi+jq and p q<i = for i = 0,...,q-l, 

l = \ d/q\ and 1= |_(d — 1 )/<?J. 
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2. Compute the values p q {uA), p q (ui J ), and rj = p q (u^ :l )/p q (^v :, ), for j = 0,1,..., q — 1, 

3. Compute and output the values a* = | Ej=o for g = 1,..., k. 

Theorem 3. Algorithm [H involves 3d + 0(q log(g)) ops. 

Proof. Stage 1 of the algorithm involves d — 1 multiplications and less than 2d additions, Stage 2 
amounts to performing two DFT(g) and q divisions, and Stage 3 amounts to performing an DFT(g) 
(because uj 9+1 , for g = 1,..., k. is the set of all gth roots of 1) and k divisions. □ 

In order to prove bound (f3|) . implying correctness of the algorithm , at first observe that p(uj J ) = 
p q {oj 3 ) and p'(u> 3 ) = p q (uJ 3 ), for j = 0,1,..., q — 1, and hence 

i i - 1 

CT* =-^w (fe+1) ' 7 p(w 7 )/p , (w 7 ), for g= 1,..., k. (5) 

q i=o 

The proof of bound dSJ also exploits the Laurent expansion 

d 1 oo oo oo 

p'{x)/p(x) =J2~— = + J2 a 9 x ~ 9 ~ 1 = Chxh ( 6 ) 

j —1 3 g =1 g —0 h=—oo 

where \x\ = 1, a 0 = 1, a g = E*=i Zj (cf. ®), a g = Ef=fe+i z i 9 > 9 = 1,2,..., that is, a' g is the fifth 
power sum of the roots of the reverse polynomial p lev (x) that lie in the disc D(0,r). The leftmost 
equation of © is verified by the differentiation of p(x) = p n rij=i( x ~ z j )■ The middle equation is 

implied by the decompositions = I^“ 0 (f) h and ^ (f) > for * > !> 

provided that \x\ = 1 for all i. (Note a link of these expressions with the following quadrature 
formulae for numerical integration, a g = 2 x ^-1 Jc(o i) xm p'( x )/p( x )dx, for g = 1,..., k.) 

In order to deduce bound ©, we next combine equations © and © and obtain 

+oo 

°k = XI C ~k-l+lq- 

l=— OO 

Moreover, equation ©, for h = —k — 1 and k > 1, implies that a /- = C-k-i , while the same equation, 
for h = k — 1 and k > 1, implies that a' k = —Ck-i- Consequently 

OO 

®k = ^ fc— 1 “1“ C—lq—k— 1)• 

Z=1 

We assumed in (O that 0 < fc < g — 1. It follows that c-i q -k -1 = crzg+fc and czq_/c_i = —for 
Z = 1,2,..., and we obtain 

oo 

~ ( 7 ) 

1=1 

Now recall that \ah\ < and |er^| < (d — l)z h , for h = 1, 2,... and 2 : = max^ =1 min(|zj|, l/\zj\), 
and so 2 < -j-b_ j n our case. Substitute these bounds into 0 and obtain 

Wl - *k| < (Z q+k + (d- l)z q ~ k )/{l - z q ). 

Therefore, bound © follows because z < r. 

By substituting q of order log(d) into bound ©, we can increase the isolation ratio of the disc 
D( 0,r) by a factor of gd h , for any pair of positive constants g and h. Therefore we obtain the 
following estimates. 
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Theorem 4. Suppose the disc D(0,r) = (x : |x| < r} is (1 + if) 2 -isolated, for (1 + rj)r = 1 and 
a fixed 11 > 0, and contains exactly k roots of a polynomial p = p(x) of degree d. Let g and h be a 
pair of positive constants. Then it is sufficient to perform 0(3d + log(d) log(log(<i))) ops in order to 
compute a gd h -isolated subdisc of D(0,r) containing exactly the same roots of p = p(x ). 

If k = 1, then o\ = z\, and by combining Theorems [T| and H] we obtain the following result. 

Corollary 5. Let a polynomial p(x) satisfy the assumptions of Theorem [/] for k = 1. Then we can 
approximate its root within e, for 0 < e < 1, by using 0(\og(d) log(log(d)) + dlog(log(l/e))) ops. 

Hereafter we write 

z + =max(\z 1 \,\z 2 \,---,\z d \). (8) 

Corollary 6. Suppose that we are given d discs, each containing a single simple root of a polynomial 
p = p(x) of degree d and each being (1 + rf) 2 -isolated, for a fixed g > 0. Then we can approximate all 
d roots of this polynomial within ez + , for z + of (0j and a fixed e, 0 < e < 1, by using 0(dlog 2 (d)(l + 
log(log(l/e)))) ops. 

Proof. Apply Algorithm [2] concurrently in all d given discs, but instead of the qth roots of unity use 
q equally spaced points at the boundary circle of each input disc (that is, dq = 0(d\ogd) points 
overall) and instead of DFT(g) apply the Moenck-Borodin algorithm for multipoint polynomial 
evaluation m- 

Also use it at the stage of performing concurrent Newton’s iteration initialized at the centers of 
the 3d-isolated subdiscs of the d input discs, each subdisc computed by the algorithm that supports 
Theorem [I] Here we work with the dth degree polynomial p rather than with the c/th degree 
polynomials p q . Indeed, in order to support transition to polynomials p q of degree q, for d discs, 
we would need to perform d shifts and scalings of the variable x, which would involve the order 
of d 2 log(d) ops, whereas by employing the Moenck-Borodin algorithm, we obtain a nearly optimal 
root-refiner, which involves 0(d) ops up to polylogarithmic factors. 

We replace the matrix fi = i n ]3]) by the matrix [c + u^ k+1 ^]j,k = c[l]yfc + fl where 

c is invariant in j and k. The multiplication of the new matrix by a vector v is still reduced to 
multiplication of the matrix Q by a vector v and to additional 3 d ops, for computing the vector 
c [l]j,fc v an d adding it to the vector flv. □ 

The Moenck-Borodin algorithm uses nearly linear arithmetic time, and m proved that this 
algorithm supports multipoint polynomial evaluation at a low Boolean cost as well (see also J. van 
der Hoeven (2008) [31] . Pan and Tsigaridas (2013a,b) [26], [27], Kobel and Sagraloff (2013) fl4j . 
Pan (2015) [23], and Pan (2015a) [25]). This immediately implies extension of our algorithm that 
support Corollary [6] to refining all simple isolated roots of a polynomial at a nearly optimal Boolean 
cost, but actually such an extension can be also obtained directly by using classical polynomial 
evaluation algorithm. 

Various other iterative root-refiners (see McNamee (2002) [15], McNamee (2007) [16], and Mc- 
Namee and Pan (2013) [Hj) applied instead of Newton’s also support our nearly optimal complexity 
estimates as long as isolation of the roots obtained by our power sum algorithm is sufficient in order to 
ensure subsequent superlinear convergence of the selected iterations. In particular Ehrlich-Aberth’s 
and WDK iterations converge globally to all roots with cubic and quadratic rate, respectively, if all 
the d discs have isolation ratios at least 3 \fd, for Ehrlich Aberth’s iterations, and 8d/3, for WDK 
iterations (cf. [30]). 

Remark 7. The algorithm of [21] and [25] (the latter paper provides more details) approximates 
a polynomial of degree d at 0(d ) points within ez + , for z + of ([5]) and fixed e such that 0 < e < 1, 
by using 0(dlog 2 (d)(l + log(l/e))) ops. This matches the bound of [T9], for 1/e of order gd h and 
for positive constants g and h. Moreover the algorithm of [24] and [25], performed with the IEEE 
standard double precision, routinely outputs close approximations to the values of the polynomial 
p(x) of degree d = 4096, at d selected points, whereas the algorithm of [19] routinely fails numerically, 
for d of about 40 (cf. [25]). 
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4 Boolean Cost Bounds 


Hereafter Ob denotes the bit (Boolean) complexity ignoring logarithmic factors. By lg(-) we denote 
the logarithm with base 2. To estimate the Boolean complexity of the algorithms supported by 
Corollaries[5]and[n]we apply some results from Pan and Tsigaridas (2013) [26] and Pan and Tsigaridas 
(2015) [27], which hold in the general case where the coefficients of the polynomials are known up 
to an arbitrary precision. In this section we assume that the polynomial p = p(x) has Gaussian 
(that is, complex integer) coefficients known exactly; the parameter A, to be specified in the sequel, 
should be considered as the working precision. We assume that we perform the computations using 
fixed point arithmetic. 

At first we consider the algorithm of approximating one complex root, z, of a polynomial p up 
to any desired precision £. By an approximation we mean absolute approximation, that is compute 
a 5 such that \z — z\ < 2~ l . We assume that the degree of p is d and that HpHoo < 2 T . 

Following the discussion that preceded Theorem [I] we compute the polynomial p q and then apply 
two DFTs, for p q and p q , and the inverse DFT, for p q /p q . 

Assume that p is given by its A-approximation p such that lg||p — p||oo < —A. Perform all the 
operations with p and keep track of the precision loss to estimate the precision of computations 
required in order to obtain the desired approximation. 

We compute p q using d additions. This results in a polynomial such that 

IglbJoo < T + lg(d) 


and 


lg(||p 9 -Pglloo) < -A + rlg(d) + 1/2lg 2 (d) + l/21g(d) = 0(-A + rlog(d) +log 2 (d)). 
Similar bounds hold for p ' q , that is, 

Ig(IKIIoo) < r + 21g(d) 

and 


HWp’q-p’qWoo) < -A + rlg(d) + 3/21g 2 (d) + l/21g(d) = 0(-A + r log(d) + log 2 (d)). 
The application of DFT on p' q leads us to the following bounds, 

bg(w’)| < r + 21g(d) + lglg(d) + 2= 0(t + log(d)) 


and 

K(^ 1 ) ~ Pqi ^)I < -A + t lg(2 d) + 3/2 lg 2 (d) + 5/2 lg(d) + lg lg(d) + 5 = 0(-A + r log(d) + log 2 (d)), 

for all i, m Lemma 16]. Similar bounds hold for p q [pj l ). 

The divisions ki = p q (Lo l )/p q (u} 1 ) output complex numbers such that 

N = b 9 (w l )/p' g (w*)| < r + 21gd + lglgd+2. 

Define their approximations ki such that 

lg(|fci - ki\) < -A + rlg(4d) + 3/2 lg 2 d + 9/2 lgd + 2 lg lg d + 11 = 0{— A + rlog(d) + log 2 {d)). 

The final DFT produces numbers such that the logarithms of their magnitudes are not greater 
than T + 21gd + 21glgd + 4 and the logarithms of their approximation errors are at most — A + 
r lg(8d) + 3/2lg 2 g? + 13/2lgd + 41glgd + 18 = 0(-A + r log(d) + log 2 (d)), [27[ Lemma 16]. 

To achieve an error within 2~ e in the final result, we perform all the computations with accuracy 
A = £ + r lg(8d) + 3/2 lg 2 d + 13/2 \gd + 4 lg lg + 18, that is A = 0(£ + rlog d + log 2 d) = 0(£ + t). 
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We perform d additions at the cost Os{d A) and perform the rest of computations, that is the 3 
DFTs, at the cost Os(log(d) loglog(d) p(X)) or Os(d(£ + r)) j27J Lemma 16]. If the root that we 
want to refine is not in the unit disc, then we replace r in our bounds with dr. 

We apply a similar analysis from [26J Section 2.3] to the Newton iteration (see also [27, Sec¬ 
tion 2.3]) and arrive at the same asymptotic bounds on the Boolean complexity. 

In [[26] and m the error bounds of Newton operator have been estimated by using the properties 
of real interval arithmetic. In this paper we perform our computation in the field of complex numbers, 
but this affects only the constants of interval arithmetic, and so asymptotically, both the error bounds 
and the complexity bounds of the Newton iterations are the same. Thus, the overall complexity is 
Os{d 2 T + d£) and the working precision is 0(dr + £). 

In our case we also assume the exact input, that is, assume the coefficients of the input poly¬ 
nomials known up to arbitrary precision; for example, they are integers. For the refinement of the 
root up to the precision of L bits, we arrive at an algorithm that supports the following complexity 
estimates. 

Theorem 8. Under the assumptions of Theorem [7] we can approximate the root z of the polynomial 
p(x) £ Z[ x ], which is of degree d and ||p||oo < 2 r , up to precision of L bits in Ob{ctt + dL). 

If we are interested in refining all complex roots, we cannot work anymore with the polynomial 
p q of degree q = 0(lg d) unless we add the cost of d shifts of the initial approximations to the origin. 
Instead we rely on fast algorithms for multipoint evaluation. Initially we evaluate the polynomial p 
of degree d at 0(d\gd) points, and we assume that lg ||p||oo < t. These d points approximate the 
roots of p , and so their magnitude is at most < 2 T . 

We use the following result of [27, Lemma 21]. Similar bounds appear in 13, HH [311 . 

Theorem 9 (Modular representation). Assume that we are given m + 1 polynomials, F £ C[x] of 
degree 2 mn and Pj £ C[x] of degree n, for j = 1 ,m such that ||-F||oo < 2 Tl and all roots of the 
polynomials Pj for all j have magnitude of at most 2 P . Furthermore assume A- approximations of F by 
F and of Pj by Pj such that ||.F — F||oo < 2 _A and \\Pj — Pj||oo < 2 _A . Let £ = A — 0(ti lg m + mn p). 
Then we can compute an £-approximations Fj of Fj = F mod Pj, for j = 1 such that 

||Tj — Fj ||oo <2~ e in C>B{mn(£ + n + mnp)). 

Using this theorem we bound the overall complexity of multipoint evaluation by Os(d(L + dr)). 
The same_bound holds at the stage where we perform Newton’s iteration. We need to apply Newton’s 
operator 0(1) for each root. Each application of the operators consists of two polynomial evaluations. 
We perform the evaluations simultaneously and apply Theorem [9] to bound the complexity. On 
similar estimates for the refinement of the real roots see m- 

We have the following theorem, which complements Corollary [5] with the Boolean complexity 
estimates. 

Theorem 10. Suppose that we are given d discs, each containing a single simple root of a polynomial 
p(x) £ Z[x\ of degree d and ||p||oo < 2 r , and each being (1 + rj) 2 -isolated, for a fixed p > 0. Then we 
can approximate all d roots of this polynomial within L bits in in Os(d 2 T + dL). 


5 Acceleration of Henrici-Renegar’s Winding Number Algo¬ 
rithm 

Assume that D(X, r ) := {x : | X — x\ < r} denote the disk of radius r > 0 with a center X and that 
no roots of a polynomial p(x) lie on its boundary circle C(X,r) = 8D{X,r) = {x : \X — x\ = r}. 
Let ujj = X + re V^i 2 nj/d ; anc j g0 co d n_ 1 denote the d! = 2 ^ log216 "1 equally spaced points on 

this circle. 

Step 1. Evaluate p(x) at wq, u)d n -i- 


Step 2. For each i, ’’label” p(wj) according to the scheme 


{ 1 if Re\p(x)\ > 0 and Im\p(x)\ > 0 

2 if Re[p(x)\ < 0 and Im[p(x)\ > 0 

3 if Re[p(x)\ < 0 and Im[p(x)\ < 0 

4 if Re[p(x)\ > 0 and Im[p(x)\ < 0, 


( 9 ) 


where Re[p(x)\ and Im[p(x)\ denote the real and imaginary parts of p{x). 

Step 3. If for some i , L[p(uji + 1 )] — L[p(i>Ji)\ = 2 mod 4, then terminate and write ’’FAILED” 
(using the definition cod' = wo). 

Step 4. Defining * : (a, b) >->■ {—1,0,1} to be the operation on ordered pairs of integers (a, b), 
a — b ^ 2 mod 4, where a* b is congruent to (a — b) mod 4, compute the integer 


# 


j CA JL 

4 L[p(pj i+1 )} - L[p{u)i)\. 

^ i=0 


Write ’’SUCCESS; 

This algorithm returns the winding number of a polynomial p(x) provided that no its roots are 
close to the boundary of the disk D{X,r). 

Lemma 11 ( |28j ). Assume that all roots of a a polynomial p{x) contained in D(X, 3r/2) are also 
contained in D(X,r/ 6). 

Then the above algorithm, applied to D(X,r), returns ’’SUCCESS; and # is the number of 
roots of p{x) in D(X,r). 

By using only the signs of the real and imaginary parts of polynomial evaluations at points 
wo, u>d n —i, Henrici-Renegar’s algorithm tracks the variation of the polynomial evaluation p{x) 
along the boundary of the disk D{X , r). but there are still numerical problems that must be resolved. 

(i) If the real or imaginary part of an evaluation p(tOj) is vanishing or is very close to 0 (even if 
|p(wj)| is not close to 0), then the signs of the values p{ojj) for some j can be difficult to determine. 

(ii) The number computed by the algorithm may not be the correct winding number of the curve. 

Inspired by [12] and [35], we provide a remedy to these problems by ’’tracking the location”, 

namely, we further divide the complex plane, and use the additional information in order to guarantee 
a stable output in spite of potential numerical errors. For the ease of application, we also changed 
the region from a disk to the region bounded by the square S(zo,r) with four vertices {zq + r + 
ry/—l, z 0 + r - ry 7 -!, z 0 — r + ryf^ T, z 0 - r - ry/^ T}. 

Modified Renegar’s Winding Number Algorithm 

Let p(z) be a polynomial of degree n. Let a,; = ^ for * = 0, n' = 2 L log 2 ( 64ra / 7r )J. Let L be 
an upper bound of the magnitude of the derivative | over [0,1]. 

Step 1. (Check for singularity.) Evaluate p(z) at ao,..., If any of these evaluation has 

absolute value less than \/2{r/2) n , then terminate and return ’’FAILED”. 

Step 2. (Check for correctness of the computations.) Check if the sum of absolute values of two 
consecutive evaluations is less than (8 rL)/n'. If any such a sum is less than this bound, terminate 
and return ’’FAILED”. 

Step 3. For each i, label p(ai) according to its argument: 

Wz)\ = (io) 

7r 

Since 0 < arg(p(x)) < 2w, L[f(x)\ returns an integer between 0 and 7. In other words, L[p{x)] 
marks the octant to which the evaluation belongs. 

Step 4. Define an operation on ordered pairs of integers: a * b := (a — b)(mod 8). We show 
in Lemma m that if there is no root of p(z) close to the boundary of the disk D(zo,r), then the 
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product L[p(oj+i)] * L\p(a,i)\ can only take a value in the set {—1, 0,1}. Lemma [T71 shows that the 
algorithm works even if each labelling is off by ± 1 . 

Step 5. Compute and return the integer 

n' — 1 

# = g L b( a i+1)] * L \p(ai)}- (11) 

8 »=o 


Then ff denotes the winding number of p(z) on the boundary of the square S(zo,r). 

Lemma 12. Let a = 1/2 + y/2 and j3 = 1/2. Assume that the only roots of p(z) contained in the 
disc D(zo,ar) are contained in the disc D(zo,(3r). Then the operation a*b in the algorithm above 
can only produce a value in the set {— 1 , 0 , 1 }. 


Proof. Define a parametrization of the square S(zo,r) as follows: 


7 (f) : [ 0 , 1 ] ¥ S(zo,r) 

! Zq + r + rv / —T — 8 rf 0 < t < \ 

zo + r - ryf-f - ( 8 t - 2)ryf-\ |<t<| 

Zq — r — ry/^1 + ( 8 t - 4)r t < t < § 
z 0 + r - r-v/^T + ( 8 f — 6 )r\/—T \ < t < 1 


Simply put, q(t) travels along S(zo,r) once at constant pace, such that 


I 7 K+ 1 ) - 7(«i)l = l a i+i - a i\ = 8r/n',i = 0 , - 1 . 


Next we show that under the hypothesis of the lemma, the argument of p(q(f)) cannot vary by more 
than 7 t/ 4 as t increases from cq to (q+i. Therefore L[p(wj+i)] * L[p(wj)] can only take a value in the 
set {— 1 , 0 , 1 }. 

Suppose otherwise, then for some U < t' < t" < U+ 1 , the argument of pipit/)) would differ from 
that ofp( 7 (t")) by 7 t/ 4. Let ^i, be the roots otp{z). Since arg{piz)) = argiz — £i)imod2n), 
for at least one root £, the argument of 7 (t') — £ would differ from that of 7 (t") — £ by more than 
7 r/ 4 n. Now let us show that this is impossible. Let dis denote the distance from £ to the line segment 
connecting y(t') and 7 it"). Under the hypothesis of the lemma, we have dis > r/2, 


Wrgijit') - £) - argi'yit") 


n _i At' — t 

= 2 tan x ( 


2 • dis 

ff I 


2 • dis 


01 


< 2 tan x ( 


8 r 


2n' ■ dis' 


<2 tan ( —) 
n 

<2 tan~ l i —) 
y 8n 

IT 

< — 

4 n 


Therefore the argument of pi'yft)) cannot vary by more than 7 r /4 as t increases from tj to 1 . 
This prove the lemma. □ 


As explained in the algorithm description, L[p(wj)] marks the octant to which p(w,;) belongs, and 
the value L[p(w i+ i)] * L[p(uq)] indicates the change of octant as i runs through 1,..., n'. The value is 
1 if the evaluation rotates counter-clockwise, -1 if the evaluation rotates clockwise, and 0 otherwise. 
Thus every full counter-clockwise cycle is indicated by an increase of ff by 1. The following lemma 
guarantees that when the evaluations are not small, the algorithm misses no turn of the curve. 
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Lemma 13. Let L be an upper bound of the magnitude of the derivative dp ^J^ \ over [0,1]. If 

b(7( a i))l > f or * = 0, then the integer returns the correct winding number of p(z) 

around the square S(zo,r). 

Proof. This lemma follows directly from [Lemma 3] |35j . □ 

Remark. We can decrease the gap between D(zo,ar) and D(zo,0r) by increasing n'. In fact, 
it can be seen from the proof that the gap can be decreased by half every time we double the value 
of n'. From now on we call the algorithm with number of samples doubled g- times as the modified 
winding number algorithm of degree g. 

Let us next estimate precision required in the above computations. What precision is enough 
for labelling the evaluations correctly? For the precise labelling we need to know the signs of both 
real and imaginary part of the evaluations, as well as the correct comparison of its values. This 
requirement is impractical as these values can be less than any working precision. However, the 

following discussion indicates that even if the labels are off by ±1, the algorithm still correctly 

returns the winding number. 

Definition 14. For each k = 0, let Lk denote the true label of the evaluation p(ak), and let 
Lfc denote the actual label of the evaluation. Note that L n i = Lq and L n j = Lq. 

The following discussion is based on the condition of Lemma [Inland the following assumption: 
Assumption^ For each k = 0, ...,tt/, the computation of its real and imaginary part is precise 
enough so that \Lf~ — Lk\ < 1. 

The assumption is satisfied if there is no complex roots near the square S(zo,r) because the 
absolute value of such an evaluation can be readily bounded. If neither real nor imaginary part is 
small enough to create confusion, then at least Lk points to the same quadrant as Lk- If one of 
them is indeed small, then the other one must be large enough so that the comparison between the 
real and the imaginary part is also clear. In particular, we can prove the following result: 

Lemma 15. Keep the assumptions of Lemma \12\ If the working precision 2~ b is greater than 
\/2{r/2) n , then for any z £ S(zo,r), at most one of the following events occurs: 

• \Re(p(z))\ < 2~ b 

• \Im(p(z))\ < 2~ b 

• | Re{p(z)) + Im(p(z))\ < 2~ b 

• I Re(p(z)) - Im(p(z ))| < 2~ b 

Proof. For z £ S(zo,r), the norm of p(z) can be bounded as follows: 

bO)l 

n 

=1 II(*-*)I 

j =i 

n 

=ni*-*i 

i=i 

n 

> n ? / 2 

1=1 

=(r/2) n 

>2~ b /V2 

Considering \p(z)\ 2 = \Re(p(z))\ 2 + \Im(p(z))\ 2 , the occurrence of any two of the four aforementioned 
conditions results in \p{z)\ < 2~ b , thus proving the lemma. □ 
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Lemma 16. For each k = 0, n!, L^+ \ * Lk ^ 4(mod 8) 

Proof. Lemma 1 121 essentially proves that Lk +1 * Lk = 0, ±1 (mod 8). Since Lk can only differ from 
Lfc by at most one, it can be easily seen that Lk+i * Lk ^ 4 (mod 8). □ 

Let efe = Lk * Lk (mod 8) denote the difference between the true label and the computed label. 
By assumption ek can be chosen such that | efc | < 1. Lemma 1 1 71 shows that the computation of the 
winding number is not be affected by such errors. 

Lemma 17. 

^ n'— 1 

#=g X! [ L k+l * Lk}. 

° fc=0 

Proof. For each k = 0,..., n', 

( Lk+i * L k ) — (Lfc+i * L k ) 

= ( Lk+i — Lk) — (Lfc+i — Lk) (mod 8) 

= ( Lk+i — Lk+ 1) — (Lk — Lk) (mod 8) 

= (Lk+i * Lfc+i) — (Lk — Lk) (mod 8) 

= tk+i - £fc (mod 8) 

Since both (Lk+ 1 * Lk) — (Lk+i * Lk) and Ck+i — ek belongs to {0, ±1, ±2}, we must have 

(Lk +i * L k ) — (Lfc+i * L k ) = tk+i — = 1, (12) 


Therefore 

^ n'— 1 

#-g X i L k+l * Lk] 

k—0 
^ n — 1 

= g [(Lk +1 * Lk) — ( Lk+i * L k )] 

k—0 
^ n — 1 

= g X [ efe+i ~ efe ] 

k =0 

= 6 n / — 6o 

= o. 

This completes the proof of Theorem [TT] □ 

The following theorem summarizes our results: 

Theorem 18. Let a = (l/2) g + v2 and f3 = 1 —(1/2) 9 . Assume that the only roots ofp(z) contained 
in D(zo : ar) are contained in D(zo,(3r) as well. Further assume that the working precision 2~ b is 
smaller than min{-\/2(r/2)", (8rL)/2 s L lo S2( 64n / 71 ')J Then the modified winding number algorithm of 
degree g will return the number of roots of p(z) in S(zo,r). 

Given the square S = S(zo,r), let r p denote the minimum distance between a root p(x) and 
this square. Then, by applying the modified winding number algorithm of degree g = 0(log(l/r p )) 
o S, we correctly calculate the number of roots in the region bounded by S. This requires n' = 
2 s\y°e. 2 i Mn /^)\ polynomial evaluations with a precision up to min{v / 2(?’/2) n , (8 rL)/n'}. Suppose that 
the cost of single polynomial evaluation with the precision 2~ b is E(n, b ), then the required precision 
b is asymptotically equal to 0(nlogLlog(l/r) + log(l/r p ) logn) and the overall evaluation cost is 
0(n(\/r p )E(n, b)). 
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5.1 Summary 

Now we summarize the key properties of the modified winding number algorithm of degree k: 

• If the algorithm succeeds on a square S = S(zo,r), then it returns the number of roots in S. 

• If there is no roots between D(zo , ](l/2) 9 + \/2]r) and D(z 0 , [1 — (l/2) 9 ]r), then the algorithm 
is guaranteed to succeed. 

• The algorithm requires polynomial evaluation with b = 0(n log L log(l/r) +log(l/r p ) log n)-bit 
of precision, and the bit-complexity is 0(n(l/T p )E(n, &)), where E(n,b) denotes the cost of 
one 6-bit polynomial evaluation. 

6 Conclusions 

Our polynomial root-finding consists of three stages. 

(i) At first, one computes some initial isolation of the roots or the clusters of the roots of a 
polynomial of a degree d by applying some known algorithms. One can apply Ehrlich-Aberth’s and 
WDK iterations, which have excellent empirical record of fast and reliable global convergence, but 
this record has not been supported by formal analysis, and the iterations are not efficient for the 
approximation of roots isolated in a disc or in a fixed region on the complex plane. One can avoid 
these deficiencies by applying advanced variants of Weyl’s Quad-tree, a.k.a. subdivision, algorithms. 
In particular a variant of |21j yields such an initial isolation at a nearly optimal arithmetic cost, 
and furthermore control of the precision of computing is rather straightforward at most of its steps. 
By incorporating our new accelerated version of winding number algorithm by Henrici-Renegar, 
we enable application of the entire construction within a nearly optimal cost bound even where a 
polyomial is given by a black box subroutine for its evaluation. 

(ii) Next one increases the isolation in order to facilitate the final stage. 

(iii) Finally, one computes close approximations to all roots or the roots in a fixed region by 
applying a simplified version of the algorithm of [13] presented in [23] and US Section 15.23]. For 
approximations to all roots one can instead apply Ehrlich Aberth’s or WDK iterations. 

Besides acceleration of winding numbe algorithm, we contribute at stage (ii). Namely, we measure 
the isolation of a root or a root cluster by the isolation ratio of its covering disc, that is, by the 
ratio of the distance from the center of the disc to the external roots and the disc radius. Then our 
algorithm increases the isolation ratio from any constant 1 + 77 , for a positive 77 , to gd h , for any pair 
of positive constants g and h. The algorithm is performed at nearly optimal arithmetic and Boolean 
cost and can be applied even when a polynomial is given by a black box for its evaluation, while its 
coefficients are unknown. 

If such an isolation is ensured, then Newton’s iterations of and ns Section 15.23] as well 

as Ehrlich-Aberth’s and WDK iterations converge right from this point with quadratic or cubic rate 
(cf. [3D]) and also perform at nearly optimal arithmetic and Boolean cost even where a polynomial 
is given by a black box for its evaluation. 

Our analysis can be readily extended to prove the same results even if we assume weaker initial 
isolation with the ratio as low as 1 + 77, for 77 of order l/log(n). Then we could move to stage (ii) 
after fewer iterations of stage (i). Further extension of our results to the case of 77 of order l/n h , for 
a positive constant h, is a theoretical challenge, but progress at stages (ii) and (iii) is important for 
both theory and practice. 

Unlike Ehrlich-Aberth’s and WDK iterations, Newton’s iterations of ns], m and [33 Section 
15.23] produce (as by-product) polynomial factorization, which is important on its own right (see 
the first paragraph of the introduction) and enable root-finding in the cases of multiple and clustered 
roots. Let us conclude with some comments on this subject. 

Our Theorem [3] covers the case where we are given a (1 + ^-isolated disc containing k roots 
of a polynomial p(x) of dH). In that case, at the cost of performing 0(d + log(d) log(log(d))) ops, 
Algorithm [2] outputs approximations to the power sums of these roots within the error bound A 
of order r 9-fc , for r = 1/(1 + 77) and positive integers q = 0{d) and k < q of our choice, say, q = 2k. 
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The algorithm can be extended to the case where all roots or root clusters of the polynomial p{x) 
are covered by s such (1 + 7?)-isolated discs. Then we can still approximate the power sums of the 
roots in all discs simultaneously by using 0(dlog 2 (d)) ops. 

At this point the known numerically stable algorithm for the transition from the power sums to 
the coefficients (cf. 0] Problem I-POWER-SUMS, pages 34-35]) approximates the coefficients of the 
s factors of pix) whose root sets are precisely the roots of p{x) lying in these s discs. The algorithm 
performs within dominated arithmetic and Boolean cost bounds. 

Having these factors computed, one can reduce root-finding for p{x) to root-finding for the s 
factors of smaller degrees. In particular, this would cover root-finding in the harder case where the 
discs contain multiple roots or root clusters. As this has been estimated in m and [23], such a divide 
and conquer approach can dramatically accelerate stage (iii), even versus the Ehrich-Aberth’s and 
WDK algorithms, except that the initialization of the respective algorithm requires a little closer 
approximation to the coefficients of the s factors of p{x) than we obtain from the power sums of 
their roots. 

The algorithm of m, extending the one of |29l , produces such a desired refinement of the output 
of Algorithm [2] at a sufficiently low asymptotic Boolean cost, but is quite involved. In particular, it 
reduces all operations with polynomials to multiplication of long integers. In our further work we 
will seek the same asymptotic complexity results at this stage, but with smaller overhead constants, 
based on the simplified version [23] and mi Section 15.23] of the algorithm of [13] . 
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